Feature selection for Support Vector Machines via Mixed Integer Linear Programming

Authors

  • Sebastián Maldonado
  • Juan Pérez
  • Richard Weber
  • Martine Labbé
Abstract

The performance of classification methods, such as Support Vector Machines, depends heavily on the proper choice of the feature set used to construct the classifier. Feature selection is an NP-hard problem that has been studied extensively in the literature. Most strategies eliminate features independently of classifier construction, either by exploiting statistical properties of each variable or via greedy search; all such strategies are heuristic by nature. In this work we propose two different Mixed Integer Linear Programming formulations based on extensions of Support Vector Machines to overcome these shortcomings. The proposed approaches perform variable selection simultaneously with classifier construction using optimization models. We ran experiments on real-world benchmark datasets, comparing our approaches with well-known feature selection techniques, and obtained better predictions with consistently fewer relevant features.
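As a rough illustration of the idea of embedding feature selection directly in the training problem, the sketch below builds a small budget-constrained, big-M MILP around a linear hinge-loss classifier with the PuLP modelling library. The toy data, the budget B, the big-M constant, and the exact objective are illustrative assumptions; this is not a reproduction of the two formulations proposed in the paper.

```python
# Hypothetical sketch (not the paper's exact formulations): a budget-constrained,
# big-M MILP that trains a linear hinge-loss classifier while selecting at most
# B features. Data, B and M are toy assumptions chosen for illustration.
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary

X = [[ 1.0, 0.2,  3.1],   # samples x features
     [ 0.9, 0.1,  2.8],
     [-1.1, 0.3, -3.0],
     [-0.8, 0.2, -2.9]]
y = [1, 1, -1, -1]         # class labels in {-1, +1}
n, p = len(X), len(X[0])
B = 2                      # feature budget (assumed)
M = 100.0                  # big-M bound on each weight (assumed)

prob = LpProblem("milp_feature_selection_svm", LpMinimize)
w  = [LpVariable(f"w_{j}", lowBound=-M, upBound=M) for j in range(p)]
b  = LpVariable("b")
xi = [LpVariable(f"xi_{i}", lowBound=0) for i in range(n)]   # hinge-loss slacks
z  = [LpVariable(f"z_{j}", cat=LpBinary) for j in range(p)]  # 1 if feature j is kept

# Objective: total hinge loss; the quadratic margin term of the standard SVM
# is dropped so the whole model stays linear.
prob += lpSum(xi)

# Classification constraints: y_i (w . x_i + b) >= 1 - xi_i
for i in range(n):
    prob += y[i] * (lpSum(X[i][j] * w[j] for j in range(p)) + b) >= 1 - xi[i]

# Big-M link: w_j can be nonzero only when z_j = 1.
for j in range(p):
    prob += w[j] <=  M * z[j]
    prob += w[j] >= -M * z[j]

# At most B features may enter the classifier.
prob += lpSum(z) <= B

prob.solve()
print("selected features:", [j for j in range(p) if z[j].value() > 0.5])
```

With PuLP's bundled CBC solver this prints the indices of the retained features; replacing the budget constraint with a penalty term on the binaries in the objective gives the other common variant of the same idea.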


Similar articles

Learning Sparse SVM for Feature Selection on Very High Dimensional Datasets

A sparse representation of Support Vector Machines (SVMs) with respect to input features is desirable for many applications. In this paper, by introducing a 0-1 control variable for each input feature, the l0-norm Sparse SVM (SSVM) is converted to a mixed integer programming (MIP) problem. Rather than directly solving this MIP, we propose an efficient cutting-plane algorithm combined with multiple ...
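For intuition, one common way to express such a 0-1 control over features (not necessarily the exact SSVM construction) is the following sketch, where a binary variable $v_j$ switches the weight $w_j$ on or off through a big-M bound:

$$
\begin{aligned}
\min_{w,\, b,\, \xi,\, v} \quad & \sum_{i=1}^{n} \xi_i + C \sum_{j=1}^{p} v_j \\
\text{s.t.} \quad & y_i\left(w^\top x_i + b\right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \\
& -M\, v_j \le w_j \le M\, v_j, \qquad v_j \in \{0, 1\},
\end{aligned}
$$

so that $\sum_j v_j$ upper-bounds the $\ell_0$-norm of $w$ and plays the role of the sparsity penalty.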


Feature Selection for Multiclass Discrimination via Mixed-Integer Linear Programming

We reformulate branch-and-bound feature selection employing L1 or particular Lp metrics as mixed-integer linear programming (MILP) problems, affording the convenience of widely available MILP solvers. These formulations offer direct influence over individual pairwise interclass margins, which is useful for feature selection in multiclass settings.


Strongly agree or strongly disagree?: Rating features in Support Vector Machines

In linear classifiers, such as the Support Vector Machines (SVM), a score is associated with each feature and objects are assigned to classes based on the linear combination of the scores and the values of the features. Inspired by discrete psychometric scales, which measure the extent to which a factor is in agreement with a statement, we propose the Discrete Level Support Vector Machines (DILS...


Multi-category Support Vector Machines, Feature Selection and Solution Path

Support Vector Machines (SVMs) have proven to deliver high performance. However, problems remain with respect to feature selection in multicategory classification. In this article, we propose an algorithm to compute an entire regularization solution path for adaptive feature selection via the L1-norm penalized multicategory SVM (L1MSVM). The advantages of this algorithm are three-fold. ...


Simultaneous Feature Selection and Classifier Training via Linear Programming: A Case Study for Face Expression Recognition

A linear programming technique is introduced that performs feature selection and classifier training jointly, so that a subset of features is selected optimally together with the classifier. Because traditional classification methods in computer vision have used a two-step approach (feature selection followed by classifier training), feature selection has often been ad hoc, relying on heuristics or re...



Journal:
  • Inf. Sci.

Volume 279, Issue -

Pages -

Publication date 2014